Research Article | Open Access
Volume 2025 | Article ID 100094 | https://doi.org/10.1016/j.plaphe.2025.100094

ChatLeafDisease: a chain-of-thought prompting approach for crop disease classification using large language models

Jiandong Pan,1,4 Renhai Zhong,2,4 Fulin Xia,1 Jingfeng Huang,2 Linchao Zhu,3 Yi Yang,3 and Tao Lin1

1College of Biosystems Engineering and Food Science, Zhejiang University, Hangzhou, Zhejiang, 310058, China
2Institute of Applied Remote Sensing and Information Technology, Zhejiang University, Hangzhou, Zhejiang, 310058, China
3College of Computer Science and Technology, Zhejiang University, Hangzhou, Zhejiang, 310058, China
4Jiandong Pan and Renhai Zhong contributed equally to this work.

Received 25 Jan 2025
Accepted 16 Jul 2025
Published 07 Aug 2025

Abstract

Accurate crop disease classification is essential for disease management and food security. Deep learning has achieved high classification accuracy in image-based disease identification. However, deep learning approaches usually need large amounts of training data to achieve satisfactory performance, which hinders their application and scalability across different crops. Large language models (LLMs) have shown strong generation capability and zero-shot performance, but how to use LLMs for crop disease classification remains unclear. In this study, we developed a training-free framework named ChatLeafDisease (ChatLD), based on the GPT-4o model with chain-of-thought (CoT) prompting, for crop disease classification. The framework includes a disease description database that provides knowledge of crop diseases and a disease classification agent guided by CoT prompts that interprets the symptom patterns of infected leaves and classifies the disease. The original GPT-4o model, the Gemini model, and the Contrastive Language-Image Pre-training (CLIP) model were chosen as baselines. Results showed that the ChatLD framework achieved higher and more stable classification accuracy (88.9%) for six tomato diseases than the GPT-4o (45.9%), Gemini (56.1%), and CLIP (64.3%) models. We found that the scoring rules enabled the ChatLD framework to capture the typical differences across diseases. Ablation results showed that integrating the scoring rules and important notes into the CoT prompts enabled ChatLD to achieve high classification accuracy. A comparison of different description texts showed that condensed disease descriptions improved classification performance. The ChatLD framework also achieved high accuracy for disease classes of new crops, highlighting its scalability across various crop diseases. The proposed framework provides a new LLM-based alternative for crop disease classification that uses only textual descriptions of diseases, without a training process.
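As a rough illustration of how a description database, CoT prompting, and scoring rules might fit together, the following minimal Python sketch builds a step-by-step prompt from a small description table and picks the highest-scoring disease from a model reply. It is not the authors' implementation: the function names (`build_cot_prompt`, `parse_scores`, `classify`), the 0 to 10 scoring scale, the reply format, and the three example descriptions are all assumptions made for illustration, and the model call itself is replaced by a hand-written mock reply.

```python
import re

# Illustrative stand-in for a disease description database.
# These entries are placeholders, not the descriptions used in the study.
DISEASE_DESCRIPTIONS = {
    "early blight": "Brown lesions with concentric rings, often with a yellow halo.",
    "late blight": "Large, dark, water-soaked lesions; white growth may appear on the underside.",
    "leaf mold": "Pale yellow spots on the upper surface; olive-green mold on the underside.",
}


def build_cot_prompt(crop, descriptions, max_score=10):
    """Assemble a step-by-step prompt; the 0..max_score scale is an assumption."""
    knowledge = "\n".join(f"- {name}: {text}" for name, text in descriptions.items())
    return (
        f"You are inspecting a {crop} leaf image.\n"
        f"Candidate diseases and their descriptions:\n{knowledge}\n\n"
        "Reason step by step:\n"
        "1. Describe the visible symptoms (color, shape, location of lesions).\n"
        f"2. Score each candidate from 0 to {max_score} by how well the symptoms "
        "match its description.\n"
        "3. Output one line per candidate in the form '<disease>: <score>'.\n"
    )


# Matches lines such as "early blight: 8" (hypothetical reply format).
_SCORE_LINE = re.compile(r"^\s*(.+?)\s*:\s*(\d+(?:\.\d+)?)\s*$")


def parse_scores(reply, candidates):
    """Extract '<disease>: <score>' lines for known candidates from a model reply."""
    known = {name.lower() for name in candidates}
    scores = {}
    for line in reply.splitlines():
        match = _SCORE_LINE.match(line)
        if match and match.group(1).lower() in known:
            scores[match.group(1).lower()] = float(match.group(2))
    return scores


def classify(reply, candidates):
    """Return the highest-scoring candidate, or None if no score was parsed."""
    scores = parse_scores(reply, candidates)
    return max(scores, key=scores.get) if scores else None


# Hand-written mock reply standing in for a multimodal LLM response.
mock_reply = """Observation: brown lesions with concentric rings.
early blight: 8
late blight: 3
leaf mold: 1"""

prompt = build_cot_prompt("tomato", DISEASE_DESCRIPTIONS)
prediction = classify(mock_reply, DISEASE_DESCRIPTIONS)
print(prediction)  # prints: early blight
```

In this sketch, adding a new crop or disease only requires new text entries in the description table, which mirrors the training-free property described in the abstract. In practice, `prompt` and the leaf image would be sent to a multimodal model in place of the mock reply.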

© 2019-2023 Plant Phenomics. All rights reserved. ISSN 2643-6515.
